56 research outputs found

    Efficient Incremental View Maintenance for Data Warehousing

    Get PDF
    Data warehousing and on-line analytical processing (OLAP) are essential elements for decision support applications. Since most OLAP queries are complex and are often executed over huge volumes of data, the solution in practice is to employ materialized views to improve query performance. One important issue for utilizing materialized views is to maintain the view consistency upon source changes. However, most prior work focused on simple SQL views with distributive aggregate functions, such as SUM and COUNT. This dissertation proposes to consider broader types of views than previous work. First, we study views with complex aggregate functions such as variance and regression. Such statistical functions are of great importance in practice. We propose a workarea function model and design a generic framework to tackle incremental view maintenance and answering queries using views for such functions. We have implemented this approach in a prototype system of IBM DB2. An extensive performance study shows significant performance gains by our techniques. Second, we consider materialized views with PIVOT and UNPIVOT operators. Such operators are widely used for OLAP applications and for querying sparse datasets. We demonstrate that the efficient maintenance of views with PIVOT and UNPIVOT operators requires more generalized operators, called GPIVOT and GUNPIVOT. We formally define and prove the query rewriting rules and propagation rules for such operators. We also design a novel view maintenance framework for applying these rules to obtain an efficient maintenance plan. Extensive performance evaluations reveal the effectiveness of our techniques. Third, materialized views are often integrated from multiple data sources. Due to source autonomicity and dynamicity, concurrency may occur during view maintenance. We propose a generic concurrency control framework to solve such maintenance anomalies. This solution extends previous work in that it solves the anomalies under both source data and schema changes and thus achieves full source autonomicity. We have implemented this technique in a data warehouse prototype developed at WPI. The extensive performance study shows that our techniques put little extra overhead on existing concurrent data update processing techniques while allowing for this new functionality

    Big Data Fusion Model for Heterogeneous Financial Market Data (FinDF)

    Get PDF
    The dawn of big data has seen the volume, variety, and velocity of data sources increase dramatically. Enormous amounts of structured, semi-structured and unstructured heterogeneous data can be garnered at a rapid rate, making analysis of such big data a herculean task. This has never been truer for data relating to financial stock markets, the biggest challenge being the 7 Vs of big data which relate to the collection, pre-processing, storage and real-time processing of such huge quantities of disparate data sources. Data fusion techniques have been adopted in a wide number of fields to cope with such vast amounts of heterogeneous data from multiple sources and fuse them together in order to produce a more comprehensive view of the data and its underlying relationships. Research into the fusing of heterogeneous financial data is scant within the literature, with existing work only taking into consideration the fusing of text-based financial documents. The lack of integration between financial stock market data, social media comments, financial discussion board posts and broker agencies means that the benefits of data fusion are not being realised to their full potential. This paper proposes a novel data fusion model, inspired by the data fusion model introduced by the Joint Directors of Laboratories, for the fusing of disparate data sources relating to financial stocks. Data with a diverse set of features from different data sources will supplement each other in order to obtain a Smart Data Layer, which will assist in scenarios such as irregularity detection and prediction of stock prices

    Neuromatch Academy: a 3-week, online summer school in computational neuroscience

    Get PDF
    Neuromatch Academy (https://academy.neuromatch.io; (van Viegen et al., 2021)) was designed as an online summer school to cover the basics of computational neuroscience in three weeks. The materials cover dominant and emerging computational neuroscience tools, how they complement one another, and specifically focus on how they can help us to better understand how the brain functions. An original component of the materials is its focus on modeling choices, i.e. how do we choose the right approach, how do we build models, and how can we evaluate models to determine if they provide real (meaningful) insight. This meta-modeling component of the instructional materials asks what questions can be answered by different techniques, and how to apply them meaningfully to get insight about brain function

    Neuromatch Academy: a 3-week, online summer school in computational neuroscience

    Get PDF

    Fast optimal twig joins

    No full text
    • …
    corecore